Abstract

We consider a one-dimensional ballistic random walk evolving in a parametric
independent and identically distributed random environment. We study the
asymptotic properties of the maximum likelihood estimator of the parameter
based on a single observation of the path until the time it reaches a distant
site. We prove an asymptotic normality result for this consistent estimator as
the distant site tends to infinity and establish that it achieves the
Cramér-Rao bound. We also explore in a simulation setting the numerical
behaviour of asymptotic confidence regions for the parameter.

1 Introduction



Random walks in random environments (RWRE) are stochastic models that allow
two kinds of uncertainty in physical systems: the first one is due to the
heterogeneity of the environment, and the second one to the evolution of a
particle in a given environment. The first studies of one-dimensional RWRE
were done by Chernov (1967) with a model of DNA replication, and by Temkin
(1972) in the field of metallurgy. From the latter work, the random media
literature inherited some famous terminology such as annealed or quenched law.
The limiting behaviour of the particle in Temkin's model was successively
investigated by Kozlov (1973), Solomon (1975) and Kesten et al. (1975). Since
these pioneering works on one-dimensional RWRE, the related literature in
physics and probability theory has become richer and a source of fine
probabilistic results that the reader may find in recent surveys including
Hughes (1996) and Zeitouni (2004).

The present work deals with the one-dimensional RWRE where we investigate a
different kind of question than the limiting behaviour of the walk. We adopt a
statistical point of view and are interested in inferring the distribution of
the environment given the observation of a long trajectory of the random walk.
This kind of question has already been studied in the context of random walks
in random colorings of $\mathbb{Z}$ (Benjamini and Kesten, 1996; Matzinger,
1999; Lowe and Matzinger, 2002) as well as in the context of RWRE for a
characterization of the environment distribution (Adelman and Enriquez, 2004;
Comets et al., 2012). Whereas Adelman and Enriquez deal with very general RWRE
and present a procedure to infer the environment distribution through a system
of moment equations, Comets et al. provide a maximum likelihood estimator
(MLE) of the parameter of the environment distribution in the specific case of
a transient ballistic one-dimensional nearest neighbour path. In the latter
work, the authors establish the consistency of their estimator and provide
synthetic experiments to assess its effective performance. It turns out that
this estimator exhibits a much smaller variance than the one of Adelman and
Enriquez. We propose to establish what the numerical investigations of Comets
et al. suggested, that is, the asymptotic normality of the MLE as well as its
asymptotic efficiency (namely, that it asymptotically achieves the Cramér-Rao
bound).

This article is organised as follows. In Section 2.1, we introduce the
framework of the one-dimensional ballistic random walk in an independent and
identically distributed (i.i.d.) parametric environment. In Section 2.2, we
present the MLE procedure developed by Comets et al. to infer the parameter of
the environment distribution. Section 2.3 recalls some already known results
on an underlying branching process in a random environment related to the
RWRE. Then, we state in Section 2.5 our asymptotic normality result in the
wake of additional hypotheses required to prove it and listed in Section 2.4.
In Section 3, we present three examples of environment distributions which are
already introduced in Comets et al. (2012), and we check that the additional
required assumptions of Section 2.4 are fulfilled, so that the MLE is
asymptotically normal and efficient in these cases. The proof of the
asymptotic normality result is presented in Section 4. We apply to the score
vector sequence a central limit theorem for centered square-integrable
martingales (Section 4.1) and we adapt to our context an asymptotic normality
result for M-estimators (Section 4.2). To conclude this part, we provide in
Section 4.3 the proof of a sufficient condition for the non-degeneracy of the
Fisher information. Finally, Section 5 illustrates our results on synthetic
data by exploring empirical coverages of asymptotic confidence regions.



2 Material and results 



2.1 Properties of a transient random walk in a random environment

Let us introduce a one-dimensional random walk (more precisely a nearest
neighbour path) evolving in a random environment (RWRE for short) and recall
its elementary properties. We start by considering the environment defined
through the collection $\omega = \{\omega_x\}_{x\in\mathbb{Z}} \in
(0,1)^{\mathbb{Z}}$ of i.i.d. random variables, with parametric distribution
$\nu = \nu_\theta$ that depends on some unknown parameter $\theta \in \Theta$.
We further assume that $\Theta \subset \mathbb{R}^d$ is a compact set. We let
$\mathbb{P}^\theta = \nu_\theta^{\otimes\mathbb{Z}}$ be the law on
$(0,1)^{\mathbb{Z}}$ of the environment $\omega$ and $\mathbb{E}^\theta$ be
the corresponding expectation.

Now, for fixed environment $\omega$, let $X = \{X_t\}_{t\in\mathbb{N}}$ be the
Markov chain on $\mathbb{Z}$ starting at $X_0 = 0$ and with (conditional)
transition probabilities

\[
P_\omega(X_{t+1} = y \mid X_t = x) =
\begin{cases}
\omega_x & \text{if } y = x+1, \\
1-\omega_x & \text{if } y = x-1, \\
0 & \text{otherwise.}
\end{cases}
\]

The quenched distribution $P_\omega$ is the conditional measure on the path
space of $X$ given $\omega$. Moreover, the annealed distribution of $X$ is
given by

\[
\mathbf{P}^\theta(\cdot) = \int P_\omega(\cdot)\, d\mathbb{P}^\theta(\omega).
\]

We write $E_\omega$ and $\mathbf{E}^\theta$ for the corresponding quenched and
annealed expectations, respectively. In the following, we assume that the
process $X$ is generated under the true parameter value $\theta^\star$, an
interior point of the parameter space $\Theta$, that we aim at estimating. We
shorten to $\mathbf{P}^\star$ and $\mathbf{E}^\star$ (resp. $\mathbb{P}^\star$
and $\mathbb{E}^\star$) the annealed (resp. environment) probability
$\mathbf{P}^{\theta^\star}$ (resp. $\mathbb{P}^{\theta^\star}$) and
corresponding expectation $\mathbf{E}^{\theta^\star}$ (resp.
$\mathbb{E}^{\theta^\star}$) under parameter value $\theta^\star$.
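
The model above can be simulated directly. The following is a minimal sketch, not the authors' code: it samples the environment lazily (one site at a time, on first visit), runs the walk from 0 until it first reaches a site n, and records for each site in 0..n how many steps went to the left, a statistic that is convenient for inference. The function names, the two-point environment law and the numerical values are assumptions chosen only for illustration.

```python
import random

def simulate_left_steps(n, sample_omega, rng):
    """Run the walk from 0 until it first hits n.

    Returns (t, left) where t is the hitting time and left[x] is the
    number of steps from site x to x - 1, for x = 0..n.
    """
    omega = {}   # environment, sampled i.i.d. on first visit of each site
    left = {}    # left-step counts for visited sites (may include x < 0)
    x, t = 0, 0
    while x < n:
        if x not in omega:
            omega[x] = sample_omega(rng)
        if rng.random() < omega[x]:
            x += 1                        # step right with probability omega_x
        else:
            left[x] = left.get(x, 0) + 1  # step left otherwise
            x -= 1
        t += 1
    return t, [left.get(site, 0) for site in range(n + 1)]

def two_point(a1, a2, p):
    """Hypothetical sampler for the law p*delta_a1 + (1-p)*delta_a2."""
    return lambda rng: a1 if rng.random() < p else a2

rng = random.Random(0)
n_sites = 200
# With a1=0.4, a2=0.7, p=0.3 the mean of (1-omega)/omega is 0.75 < 1.
hitting_time, left_counts = simulate_left_steps(n_sites, two_point(0.4, 0.7, 0.3), rng)
```

Sampling the environment lazily is equivalent to sampling it in full beforehand, since the sites are i.i.d. and each value is fixed once drawn.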

The behaviour of the process $X$ is related to the ratio sequence

\[
\rho_x = \frac{1-\omega_x}{\omega_x}, \qquad x \in \mathbb{Z}. \tag{1}
\]

We refer to Solomon (1975) for the classification of $X$ between transient or
recurrent cases according to whether $\mathbb{E}^\theta(\log\rho_0)$ is
different or not from zero (the classification is also recalled in Comets et
al., 2012). In our setup, we consider a transient process and without loss of
generality, assume that it is transient to the right, thus corresponding to
$\mathbb{E}^\theta(\log\rho_0) < 0$. The transient case may be further split
into two sub-cases, called ballistic and sub-ballistic, that correspond to a
linear and a sub-linear speed for the walk, respectively. More precisely,
letting $T_n$ be the first hitting time of the positive integer $n$,

\[
T_n = \inf\{t \in \mathbb{N} : X_t = n\}, \tag{2}
\]

and assuming $\mathbb{E}^\theta(\log\rho_0) < 0$ all through, we can
distinguish the following cases.

(a1) (Ballistic). If $\mathbb{E}^\theta(\rho_0) < 1$ then,
$\mathbf{P}^\theta$-almost surely,

\[
\frac{T_n}{n} \xrightarrow[n\to\infty]{}
\frac{1+\mathbb{E}^\theta(\rho_0)}{1-\mathbb{E}^\theta(\rho_0)}. \tag{3}
\]

(a2) (Sub-ballistic). If $\mathbb{E}^\theta(\rho_0) \ge 1$, then $T_n/n \to
+\infty$, $\mathbf{P}^\theta$-almost surely when $n$ tends to infinity.

Moreover, the fluctuations of $T_n$ depend in nature on a parameter $\kappa
\in (0,\infty]$, which is defined as the unique positive solution of

\[
\mathbb{E}^\theta(\rho_0^\kappa) = 1
\]

when such a number exists, and $\kappa = +\infty$ otherwise. The ballistic
case corresponds to $\kappa > 1$. Under mild additional assumptions, Kesten et
al. (1975) proved that

(aI) if $\kappa \ge 2$, then $T_n$ has Gaussian fluctuations. Precisely, if
$c$ denotes the limit in (3), then $n^{-1/2}(T_n - nc)$ when $\kappa > 2$, and
$(n\log n)^{-1/2}(T_n - nc)$ when $\kappa = 2$ have a non-degenerate Gaussian
limit.

(aII) if $\kappa < 2$, then $n^{-1/\kappa}(T_n - d_n)$ has a non-degenerate
limit distribution, which is a stable law with index $\kappa$.

The centering is $d_n = 0$ for $\kappa < 1$, $d_n = a n\log n$ for $\kappa =
1$, and $d_n = an$ for $\kappa \in (1,2)$, for some positive constant $a$.
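
To make the role of the exponent concrete, the sketch below computes the positive root of $\mathbb{E}(\rho_0^\kappa) = 1$ by bisection for a two-point environment law $p\,\delta_{a_1} + (1-p)\,\delta_{a_2}$. This choice of law, the function name and the numerical values are assumptions made for illustration only. The code also assumes transience to the right, so that the moment function is below 1 just to the right of 0.

```python
import math

def kappa_two_point(a1, a2, p, tol=1e-10):
    """Positive root of E[rho^k] = 1 for nu = p*delta_a1 + (1-p)*delta_a2.

    Assumes E[log rho] < 0. Returns math.inf when rho <= 1 almost surely,
    in which case the moment never comes back up to 1.
    """
    r1, r2 = (1 - a1) / a1, (1 - a2) / a2
    if max(r1, r2) <= 1:
        return math.inf

    def moment(k):
        return p * r1 ** k + (1 - p) * r2 ** k

    hi = 1.0
    while moment(hi) < 1:        # convexity: the moment eventually exceeds 1
        hi *= 2
    lo = hi / 2
    while moment(lo) >= 1:       # move left until strictly below the root
        hi, lo = lo, lo / 2
    while hi - lo > tol:         # plain bisection on [lo, hi]
        mid = (lo + hi) / 2
        if moment(mid) < 1:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Symmetric support {a, 1-a} with a = 0.7 and p = 0.8.
kappa_example = kappa_two_point(0.7, 0.3, 0.8)
```

For a symmetric support $\{a, 1-a\}$ the root has the closed form $\log(p/(1-p)) / \log(a/(1-a))$, which gives a convenient check of the numerical value.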



2.2 A consistent estimator 



We briefly recall the definition of the estimator proposed in Comets et al.
(2012) to infer the parameter $\theta$, when we observe $X_{[0,T_n]} = \{X_t :
t = 0,1,\dots,T_n\}$, for some value $n \ge 1$. It is defined as the maximizer
of some well-chosen criterion function, which roughly corresponds to the
log-likelihood of the observations.

We start by introducing the statistics $\{L_x^n\}_{x\in\mathbb{Z}}$, defined as

\[
L_x^n := \sum_{s=0}^{T_n-1} \mathbf{1}\{X_s = x;\ X_{s+1} = x-1\},
\]

namely $L_x^n$ is the number of left steps of the process $X_{[0,T_n]}$ from
site $x$. Here, $\mathbf{1}\{\cdot\}$ denotes the indicator function.

Definition 2.1. Let $\phi_\theta$ be the function from $\mathbb{N}^2$ to
$\mathbb{R}$ given by

\[
\phi_\theta(x,y) = \log \int_0^1 a^{x+1}(1-a)^y \, d\nu_\theta(a). \tag{4}
\]

The criterion function $\theta \mapsto \ell_n(\theta)$ is defined as

\[
\ell_n(\theta) = \sum_{x=0}^{n-1} \phi_\theta(L_{x+1}^n, L_x^n). \tag{5}
\]



We now recall the assumptions stated in Comets et al. (2012) ensuring that the
maximizer of the criterion $\ell_n$ is a consistent estimator of the unknown
parameter.

Assumption I. (Consistency conditions).

i) (Transience to the right). For any $\theta \in \Theta$,
$\mathbb{E}^\theta|\log\rho_0| < \infty$ and $\mathbb{E}^\theta(\log\rho_0) <
0$.

ii) (Ballistic case). For any $\theta \in \Theta$, $\mathbb{E}^\theta(\rho_0)
< 1$.

iii) (Continuity). For any $(x,y) \in \mathbb{N}^2$, the map $\theta \mapsto
\phi_\theta(x,y)$ is continuous on the parameter set $\Theta$.

iv) (Identifiability). For any $(\theta,\theta') \in \Theta^2$, $\nu_\theta
\ne \nu_{\theta'} \iff \theta \ne \theta'$.

v) The collection of probability measures $\{\nu_\theta : \theta \in \Theta\}$
is such that

\[
\inf_{\theta\in\Theta} \mathbb{E}^\theta[\log(1-\omega_0)] > -\infty.
\]

According to Assumption I point iii), the function $\theta \mapsto
\ell_n(\theta)$ is continuous on the compact parameter set $\Theta$. Thus, it
achieves its maximum, and the estimator $\hat\theta_n$ is defined as one
maximizer of this criterion.

Definition 2.2. An estimator $\hat\theta_n$ of $\theta$ is defined as a
measurable choice

\[
\hat\theta_n \in \operatorname*{Argmax}_{\theta\in\Theta} \ell_n(\theta). \tag{6}
\]

Note that $\hat\theta_n$ is not necessarily unique. As explained in Comets et
al. (2012), with a slight abuse of notation, $\hat\theta_n$ may be considered
an MLE. Moreover, under Assumption I, Comets et al. (2012) establish its
consistency, namely its convergence in $\mathbf{P}^\star$-probability to the
true parameter value $\theta^\star$.

2.3 The role of an underlying branching process

We introduce in this section an underlying branching process with immigration
in random environment (BPIRE) that is naturally related to the RWRE. Indeed,
it is well-known that for an i.i.d. environment, under the annealed law
$\mathbf{P}^\star$, the sequence $L_n^n, L_{n-1}^n, \dots, L_0^n$ has the same
distribution as a BPIRE denoted $Z_0, \dots, Z_n$, and defined by

\[
Z_0 = 0, \quad\text{and for } k = 0,\dots,n-1, \quad
Z_{k+1} = \sum_{i=0}^{Z_k} \xi'_{k+1,i}, \tag{7}
\]

with $\{\xi'_{k,i}\}_{k \ge 1,\, i \ge 0}$ independent and, conditionally on
the environment, geometrically distributed on $\mathbb{N}$ (see for instance
Kesten et al., 1975; Comets et al., 2012). Let us introduce through the
function $\phi_\theta$ defined by (4) the transition kernel $Q_\theta$ on
$\mathbb{N}^2$ defined as

\[
Q_\theta(x,y) = \binom{x+y}{x} e^{\phi_\theta(x,y)}
= \binom{x+y}{x} \int_0^1 a^{x+1}(1-a)^y \, d\nu_\theta(a). \tag{8}
\]
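
As a quick numerical illustration of the kernel (8), the sketch below evaluates $Q_\theta(x,y)$ for a two-point environment law $p\,\delta_{a_1} + (1-p)\,\delta_{a_2}$ and checks two properties on one row: the row sums to one, and its mean equals $(x+1)$ times the mean of $(1-\omega_0)/\omega_0$. The environment law, the function name, the truncation level and the numerical values are assumptions made only for this example.

```python
import math

def q_kernel(x, y, a1, a2, p):
    """Kernel (8) when nu = p*delta_a1 + (1-p)*delta_a2 (illustrative choice)."""
    mix = (p * a1 ** (x + 1) * (1 - a1) ** y
           + (1 - p) * a2 ** (x + 1) * (1 - a2) ** y)
    return math.comb(x + y, x) * mix

a1, a2, p, x = 0.4, 0.7, 0.3, 3
# Truncate the sum over y; the neglected tail is geometrically small.
row = [q_kernel(x, y, a1, a2, p) for y in range(400)]
row_sum = sum(row)
row_mean = sum(y * q for y, q in enumerate(row))
mean_rho = p * (1 - a1) / a1 + (1 - p) * (1 - a2) / a2
```

Each row is a mixture of negative binomial laws, which is why the total mass is 1 and the mean is linear in $x + 1$.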



Then for each value $\theta \in \Theta$, under annealed law
$\mathbf{P}^\theta$ the BPIRE $\{Z_n\}_{n\in\mathbb{N}}$ is an irreducible
positive recurrent homogeneous Markov chain with transition kernel $Q_\theta$
and a unique stationary probability distribution denoted by $\pi_\theta$.
Moreover, the moments of $\pi_\theta$ may be characterised through the
distribution of the ratios $\{\rho_x\}_{x\in\mathbb{Z}}$. The following
statement is a direct consequence of the proof of Theorem 4.5 in Comets et al.
(2012) (see Equation (16) in this proof).

Proposition 2.3 (Theorem 4.5 in Comets et al. (2012)). The invariant
probability measure $\pi_\theta$ is positive on $\mathbb{N}$ and satisfies

\[
\forall j \ge 0, \quad
\sum_{k \ge j+1} k(k-1)\cdots(k-j)\,\pi_\theta(k)
= (j+1)!\; \mathbb{E}^\theta\Big[\Big(\sum_{n\ge1}\prod_{k=1}^{n}\rho_k\Big)^{j+1}\Big].
\]

In particular, $\pi_\theta$ has a finite first moment in the ballistic case.

Note that the criterion $\ell_n$ satisfies the following property

\[
\ell_n(\theta) \sim \sum_{k=0}^{n-1} \phi_\theta(Z_k, Z_{k+1})
\quad\text{under } \mathbf{P}^\star, \tag{9}
\]

where $\sim$ means equality in distribution. For each value $\theta \in
\Theta$, under annealed law $\mathbf{P}^\theta$ the process
$\{(Z_n,Z_{n+1})\}_{n\in\mathbb{N}}$ is also an irreducible positive recurrent
homogeneous Markov chain with unique stationary probability distribution
denoted by $\tilde\pi_\theta$ and defined as

\[
\tilde\pi_\theta(x,y) = \pi_\theta(x)\, Q_\theta(x,y), \quad
\forall (x,y) \in \mathbb{N}^2. \tag{10}
\]

For any function $g : \mathbb{N}^2 \to \mathbb{R}$ such that $\sum_{x,y}
\tilde\pi_\theta(x,y)\,|g(x,y)| < \infty$, we denote by $\tilde\pi_\theta(g)$
the quantity

\[
\tilde\pi_\theta(g) = \sum_{(x,y)\in\mathbb{N}^2} \tilde\pi_\theta(x,y)\, g(x,y). \tag{11}
\]

We extend the notation above for any function $g = (g_1,\dots,g_d) :
\mathbb{N}^2 \to \mathbb{R}^d$ such that $\tilde\pi_\theta(\|g\|) < \infty$,
where $\|\cdot\|$ is the uniform norm, and denote by $\tilde\pi_\theta(g)$ the
vector $(\tilde\pi_\theta(g_1),\dots,\tilde\pi_\theta(g_d))$. The following
ergodic theorem is valid.

Proposition 2.4. (Theorem 4.2 in Chapter 4 from Revuz, 1984). Under point i)
in Assumption I, for any function $g : \mathbb{N}^2 \to \mathbb{R}^d$ such
that $\tilde\pi_\theta(\|g\|) < \infty$, the following ergodic theorem holds

\[
\lim_{n\to\infty} \frac{1}{n} \sum_{k=0}^{n-1} g(Z_k, Z_{k+1}) = \tilde\pi_\theta(g),
\]

$\mathbf{P}^\theta$-almost surely and in $L^1(\mathbf{P}^\theta)$.

2.4 Assumptions for asymptotic normality 

Assumption I is required for the construction of a consistent estimator of the
parameter $\theta$. It mainly consists in a transient random walk with linear
speed (ballistic regime) plus some regularity assumptions on the model with
respect to $\theta \in \Theta$. Now, the asymptotic normality result for this
estimator requires additional hypotheses.

In the following, for any function $g_\theta$ depending on the parameter
$\theta$, the symbols $\dot g_\theta$ or $\partial_\theta g_\theta$ and $\ddot
g_\theta$ or $\partial^2_\theta g_\theta$ denote the (column) gradient vector
and Hessian matrix with respect to $\theta$, respectively. Moreover, $Y^\top$
is the row vector obtained by transposing the column vector $Y$.

Assumption II. (Differentiability). The collection of probability measures
$\{\nu_\theta : \theta \in \Theta\}$ is such that for any $(x,y) \in
\mathbb{N}^2$, the map $\theta \mapsto \phi_\theta(x,y)$ is twice continuously
differentiable on $\Theta$.

Assumption III. (Regularity conditions). For any $\theta \in \Theta$, there
exists some $q > 1$ such that

\[
\tilde\pi_\theta\big(\|\dot\phi_\theta\|^{2q}\big) < +\infty. \tag{12}
\]

For any $x \in \mathbb{N}$,

\[
\sum_{y\in\mathbb{N}} \dot Q_\theta(x,y)
= \partial_\theta \sum_{y\in\mathbb{N}} Q_\theta(x,y) = 0. \tag{13}
\]

Assumption IV. (Uniform conditions). For any $\theta \in \Theta$, there exists
some neighborhood $\mathcal{V}(\theta)$ of $\theta$ such that

\[
\tilde\pi_\theta\Big(\sup_{\theta'\in\mathcal{V}(\theta)} \|\dot\phi_{\theta'}\|^2\Big) < +\infty
\quad\text{and}\quad
\tilde\pi_\theta\Big(\sup_{\theta'\in\mathcal{V}(\theta)} \|\ddot\phi_{\theta'}\|\Big) < +\infty. \tag{14}
\]

Assumptions II and III are technical and involved in the proof of a central
limit theorem (CLT) for the gradient vector of the criterion $\ell_n$, also
called score vector sequence. Assumption IV is also technical and involved in
the proof of asymptotic normality of $\hat\theta_n$ from the latter CLT. Note
that Assumption III also allows us to define the matrix

\[
\Sigma_\theta = \tilde\pi_\theta\big(\dot\phi_\theta \dot\phi_\theta^\top\big). \tag{15}
\]

Combining the definitions (8), (10), (11) and (15) with Assumption III, we
obtain the equivalent expression for $\Sigma_\theta$

\[
\begin{aligned}
\Sigma_\theta
&= \sum_{x\in\mathbb{N}} \pi_\theta(x) \sum_{y\in\mathbb{N}}
   \frac{1}{Q_\theta(x,y)}\, \dot Q_\theta(x,y)\, \dot Q_\theta(x,y)^\top \\
&= -\sum_{x\in\mathbb{N}} \pi_\theta(x) \sum_{y\in\mathbb{N}}
   \Big[\ddot Q_\theta(x,y) - \frac{1}{Q_\theta(x,y)}\, \dot Q_\theta(x,y)\, \dot Q_\theta(x,y)^\top\Big] \\
&= -\tilde\pi_\theta(\ddot\phi_\theta).
\end{aligned} \tag{16}
\]

Assumption V. (Fisher information matrix). For any value $\theta \in \Theta$,
the matrix $\Sigma_\theta$ is non singular.

Assumption V states invertibility of the Fisher information matrix
$\Sigma_{\theta^\star}$. This assumption is necessary to prove asymptotic
normality of $\hat\theta_n$ from the previously mentioned CLT on the score
vector sequence.



2.5 Results 

Theorem 2.5. Under Assumptions I to III, the score vector sequence
$\dot\ell_n(\theta^\star)/\sqrt{n}$ is asymptotically normal with mean zero
and finite covariance matrix $\Sigma_{\theta^\star}$.

Theorem 2.6. (Asymptotic normality). Under Assumptions I to V, for any choice
of $\hat\theta_n$ satisfying (6), the sequence $\{\sqrt{n}(\hat\theta_n -
\theta^\star)\}_{n\in\mathbb{N}}$ converges in $\mathbf{P}^\star$-distribution
to a centered Gaussian random vector with covariance matrix
$\Sigma_{\theta^\star}^{-1}$.

Note that the limiting covariance matrix of $\sqrt{n}\,\hat\theta_n$ is exactly
the inverse Fisher information matrix of the model. As such, our estimator is
efficient. Moreover, the previous theorem may be used to build asymptotic
confidence regions for $\theta$, as illustrated in Section 5. In this section,
we also explain how to estimate the Fisher information matrix
$\Sigma_{\theta^\star}$. Indeed, $\Sigma_{\theta^\star}$ is defined via the
invariant distribution $\tilde\pi_{\theta^\star}$ which possesses no
analytical expression. To bypass the problem, we rely on the observed Fisher
information matrix as an estimator of $\Sigma_{\theta^\star}$.
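
To fix ideas, here is a minimal sketch of such a confidence interval in the simplest scalar case, a two-point environment law $p\,\delta_{a_1} + (1-p)\,\delta_{a_2}$ with known support, where the second derivative of $\phi_p$ in $p$ is minus the squared first derivative. It is only one plausible reading of the construction and not the procedure of Section 5: the function names, the data pairs $(L_{x+1}^n, L_x^n)$, the point estimate and the quantile 1.96 are all assumptions made for illustration.

```python
import math

def score_term(x, y, p, a1, a2):
    """Derivative in p of phi_p(x, y) for nu_p = p*delta_a1 + (1-p)*delta_a2."""
    u1 = a1 ** (x + 1) * (1 - a1) ** y
    u2 = a2 ** (x + 1) * (1 - a2) ** y
    return (u1 - u2) / (p * u1 + (1 - p) * u2)

def observed_information(pairs, p, a1, a2):
    """Mean over the pairs of minus the second derivative of phi_p.

    In this model that quantity equals the squared score term.
    """
    return sum(score_term(x, y, p, a1, a2) ** 2 for x, y in pairs) / len(pairs)

def wald_interval(pairs, p_hat, a1, a2, z=1.96):
    """Interval p_hat +/- z / sqrt(n * observed information at p_hat)."""
    info = observed_information(pairs, p_hat, a1, a2)
    half_width = z / math.sqrt(len(pairs) * info)
    return p_hat - half_width, p_hat + half_width

# Hypothetical pairs (L_{x+1}, L_x) and a hypothetical point estimate.
pairs = [(0, 3), (3, 0), (3, 0), (1, 1), (0, 0), (2, 0)]
low, high = wald_interval(pairs, 0.3, 0.4, 0.7)
```

With so few pairs the interval is very wide; the asymptotic statement only becomes informative for large n.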

Remark 2.7. We observe that the fluctuations of the estimator $\hat\theta_n$
are unrelated to those of $T_n$ or those of $X_t$, see (aI)-(aII). Though
there is a change of limit law from Gaussian to stable as $\kappa$ decreases
from larger to smaller than 2, the MLE remains asymptotically normal in the
full ballistic region (no extra assumption is required in Example I introduced
in Section 3). We illustrate this point by considering a naive estimator at
the end of Subsection 3.1.

We conclude this section by providing a sufficient condition for Assumption V
to be valid, namely ensuring that $\Sigma_\theta$ is positive definite.

Proposition 2.8. For the covariance matrix $\Sigma_\theta$ to be positive
definite, it is sufficient that the linear span in $\mathbb{R}^d$ of the
gradient vectors $\dot\phi_\theta(x,y)$, with $(x,y) \in \mathbb{N}^2$, is
equal to the full space, or equivalently, that

\[
\operatorname{Vect}\big\{\partial_\theta \mathbb{E}^\theta\big(\omega_0^{x+1}(1-\omega_0)^y\big)
: (x,y) \in \mathbb{N}^2\big\} = \mathbb{R}^d.
\]

Section 4 is devoted to the proof of Theorem 2.6, where Subsections 4.1 and
4.3 are concerned with the proofs of Theorem 2.5 and Proposition 2.8,
respectively.



3 Examples 



3.1 Environment with finite and known support

Example I. Fix $a_1 < a_2 \in (0,1)$ and let $\nu_p = p\,\delta_{a_1} +
(1-p)\,\delta_{a_2}$, where $\delta_a$ is the Dirac mass located at value $a$.
Here, the unknown parameter is the proportion $p \in \Theta \subset [0,1]$
(namely $\theta = p$). We suppose that $a_1$, $a_2$ and $\Theta$ are such that
points i) and ii) in Assumption I are satisfied.

This example is easily generalized to $\nu$ having $m \ge 2$ support points,
namely $\nu_\theta = \sum_{i=1}^{m} p_i \delta_{a_i}$, where $a_1,\dots,a_m$
are distinct, fixed and known in $(0,1)$, we let $p_m = 1 - \sum_{i=1}^{m-1}
p_i$ and the parameter is now $\theta = (p_1,\dots,p_{m-1})$.

In the framework of Example I, we have

\[
\phi_p(x,y) = \log\big[p\,a_1^{x+1}(1-a_1)^y + (1-p)\,a_2^{x+1}(1-a_2)^y\big], \tag{17}
\]

and

\[
\ell_n(p) = \sum_{x=0}^{n-1} \log\Big[p\,a_1^{L_{x+1}^n+1}(1-a_1)^{L_x^n}
+ (1-p)\,a_2^{L_{x+1}^n+1}(1-a_2)^{L_x^n}\Big]. \tag{18}
\]

Comets et al. (2012) proved that $\hat p_n = \operatorname{Argmax}_{p\in\Theta}
\ell_n(p)$ converges in $\mathbf{P}^\star$-probability to $p^\star$. There is
no analytical expression for the value of $\hat p_n$. Nonetheless, this
estimator may be easily computed by numerical methods. We now establish that
the assumptions needed for asymptotic normality are also satisfied in this
case, under the only additional assumption that $\Theta \subset (0,1)$.
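
One possible numerical method is sketched below. Since each term of the criterion is the logarithm of an affine function of $p$, the criterion is concave in $p$, so a ternary search on a compact interval is enough. This is an illustrative sketch rather than the procedure used by the authors; the function names, the search interval $[0.01, 0.99]$ and the toy data pairs $(L_{x+1}^n, L_x^n)$ are assumptions.

```python
import math

def log_likelihood(pairs, p, a1, a2):
    """Criterion (18) evaluated on pairs (L_{x+1}, L_x)."""
    total = 0.0
    for x, y in pairs:
        total += math.log(p * a1 ** (x + 1) * (1 - a1) ** y
                          + (1 - p) * a2 ** (x + 1) * (1 - a2) ** y)
    return total

def mle_proportion(pairs, a1, a2, lo=0.01, hi=0.99, iters=200):
    """Ternary search for the maximizer of the concave map p -> l_n(p) on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if log_likelihood(pairs, m1, a1, a2) < log_likelihood(pairs, m2, a1, a2):
            lo = m1   # the maximizer lies to the right of m1
        else:
            hi = m2   # the maximizer lies to the left of m2
    return (lo + hi) / 2

# Toy data: one pair favouring a1 = 0.4 and two pairs favouring a2 = 0.7.
pairs = [(0, 3), (3, 0), (3, 0)]
p_hat = mle_proportion(pairs, 0.4, 0.7)
```

When the data favour one support point everywhere, the criterion is monotone and the search returns an endpoint of the interval, which is consistent with maximizing over a compact parameter set.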

Proposition 3.1. In the framework of Example I, assuming moreover that $\Theta
\subset (0,1)$, Assumptions II to IV are satisfied.

Proof. The function $p \mapsto \phi_p(x,y)$ given by (17) is twice
continuously differentiable for any $(x,y)$. The derivatives are given by

\[
\begin{aligned}
\dot\phi_p(x,y) &= e^{-\phi_p(x,y)}\big[a_1^{x+1}(1-a_1)^y - a_2^{x+1}(1-a_2)^y\big], \\
\ddot\phi_p(x,y) &= -\dot\phi_p(x,y)^2.
\end{aligned}
\]

Since $\exp[\phi_p(x,y)] \ge p\,a_1^{x+1}(1-a_1)^y$ and $\exp[\phi_p(x,y)] \ge
(1-p)\,a_2^{x+1}(1-a_2)^y$, we obtain the bounds

\[
|\dot\phi_p(x,y)| \le \frac{1}{p} + \frac{1}{1-p}.
\]

Now, under the additional assumption that $\Theta \subset (0,1)$, there exists
some $A \in (0,1)$ such that $\Theta \subset [A, 1-A]$ and then

\[
\sup_{p\in\Theta} |\dot\phi_p(x,y)| \le \frac{2}{A}
\quad\text{and}\quad
\sup_{p\in\Theta} |\ddot\phi_p(x,y)| \le \frac{4}{A^2}, \tag{19}
\]

which yields that (12) and (14) are satisfied. Now, noting that

\[
\dot Q_\theta(x,y) = \binom{x+y}{x}\big[a_1^{x+1}(1-a_1)^y - a_2^{x+1}(1-a_2)^y\big],
\]

and that

\[
\sum_{y=0}^{\infty} \binom{x+y}{x} a^{x+1}(1-a)^y = 1, \quad
\forall x \in \mathbb{N},\ \forall a \in (0,1), \tag{20}
\]

yields (13). □



Proposition 3.2. In the framework of Example I, the covariance matrix
$\Sigma_\theta$ is positive definite, namely Assumption V is satisfied.

Proof of Proposition 3.2. We have

\[
\mathbb{E}^p(\omega_0) = p(a_1 - a_2) + a_2,
\]

with derivative $a_1 - a_2 \ne 0$, which achieves the proof thanks to
Proposition 2.8. □

Thanks to Theorem 2.6 and Propositions 3.1 and 3.2, the sequence
$\{\sqrt{n}(\hat p_n - p^\star)\}$ converges in $\mathbf{P}^\star$-distribution
to a non degenerate centered Gaussian random variable, with variance

\[
\Sigma_{p^\star}^{-1} = \Bigg\{ \sum_{(x,y)\in\mathbb{N}^2} \pi_{p^\star}(x) \binom{x+y}{x}
\frac{\big[a_1^{x+1}(1-a_1)^y - a_2^{x+1}(1-a_2)^y\big]^2}
     {p^\star a_1^{x+1}(1-a_1)^y + (1-p^\star)\, a_2^{x+1}(1-a_2)^y} \Bigg\}^{-1}.
\]

Remark 3.3. (Temkin model, cf. Hughes (1996)). With $a \in (1/2,1)$ known and
$\theta = p \in (0,1)$ unknown, we consider $\nu_\theta = p\,\delta_a +
(1-p)\,\delta_{1-a}$. This is a particular case of Example I. It is easy to
see that transience to the right and ballistic regime, respectively, are
equivalent to

\[
p > 1/2, \qquad p > a,
\]

and that in the ballistic case, the limit $c = c(p)$ in (3) is given by

\[
c(p) = \frac{a + p - 2ap}{(2a-1)(p-a)}.
\]

We construct a new estimator $\tilde p_n$ of $p$ solving the relation
$c(\tilde p_n) = T_n/n$, namely

\[
\tilde p_n = \frac{a}{2a-1} \cdot \frac{(2a-1)T_n + n}{T_n + n}.
\]

This new estimator is consistent in the full ballistic region. However, for
all $a > 1/2$ and $p > a$ but close to it, we have $\kappa \in (1,2)$, the
fluctuations of $T_n$ are of order $n^{1/\kappa}$, and those of $\tilde p_n$
are of the same order. This new estimator is much more spread out than the MLE
$\hat p_n$.
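
The two closed-form expressions of this remark are easy to check against each other numerically. The sketch below implements the limit $c(p)$ and the estimator obtained by solving $c(p) = T_n/n$; the function names and the numerical values are illustrative assumptions.

```python
def limit_inverse_speed(p, a):
    """Limit c(p) of T_n / n in the Temkin model, valid when p > a > 1/2."""
    return (a + p - 2 * a * p) / ((2 * a - 1) * (p - a))

def naive_estimator(t_n, n, a):
    """Solution in p of c(p) = t_n / n."""
    return a / (2 * a - 1) * ((2 * a - 1) * t_n + n) / (t_n + n)

# Round trip: feed the exact limit back into the estimator.
a, p, n = 0.7, 0.9, 1000
c = limit_inverse_speed(p, a)
p_back = naive_estimator(c * n, n, a)
```

Plugging the exact limit into the estimator returns the proportion one started from, which confirms that the two formulas are inverse to each other.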



3.2 Environment with two unknown support points 

Example II. We let $\nu_\theta = p\,\delta_{a_1} + (1-p)\,\delta_{a_2}$ and
now the unknown parameter is $\theta = (p, a_1, a_2) \in \Theta$, where
$\Theta$ is a compact subset of

\[
(0,1) \times \{(a_1,a_2) \in (0,1)^2 : a_1 < a_2\}.
\]

We suppose that $\Theta$ is such that points i) and ii) in Assumption I are
satisfied.

The function $\phi_\theta$ and the criterion $\ell_n(\cdot)$ are given by (17)
and (18), respectively.

Comets et al. (2012) have established that the estimator $\hat\theta_n$ is
well-defined and consistent in probability. Once again, there is no analytical
expression for the value of $\hat\theta_n$. Nonetheless, this estimator may
also be easily computed by numerical methods. We now establish that the
assumptions needed for asymptotic normality are also satisfied in this case,
under a mild additional moment assumption.

Proposition 3.4. In the framework of Example II, assuming moreover that
$\mathbb{E}^\theta(\rho_0^3) < 1$, Assumptions II to IV are satisfied.

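Proof. In the proof of Proposition 3.1, we have already controlled the
derivative of $\phi_\theta(x,y)$ with respect to $p$. Hence, it is now
sufficient to control its derivatives with respect to $a_1$ and $a_2$ to
achieve the proof of (12) and (14). We have

\[
\begin{aligned}
\partial_{a_1}\phi_\theta(x,y) &= e^{-\phi_\theta(x,y)}\, p\, a_1^{x}(1-a_1)^{y-1}\big[(x+1)(1-a_1) - y a_1\big], \\
\partial_{a_2}\phi_\theta(x,y) &= e^{-\phi_\theta(x,y)}\, (1-p)\, a_2^{x}(1-a_2)^{y-1}\big[(x+1)(1-a_2) - y a_2\big].
\end{aligned}
\]

Since

\[
e^{-\phi_\theta(x,y)}\, p\, a_1^{x}(1-a_1)^{y-1} \le \frac{1}{a_1(1-a_1)}
\]

and

\[
e^{-\phi_\theta(x,y)}\, (1-p)\, a_2^{x}(1-a_2)^{y-1} \le \frac{1}{a_2(1-a_2)},
\]

we can see that there exists a constant $B$ such that

\[
|\partial_{a_j}\phi_\theta(x,y)| \le \frac{x+1}{a_j} + \frac{y}{1-a_j}
\le B(x+1+y), \quad\text{for } j = 1,2. \tag{21}
\]

Now, we prove that (12) is satisfied with $q = 3/2$. From (21), it is
sufficient to check that

\[
\sum_{k} k^3 \pi_\theta(k) < \infty,
\]

which is equivalent to

\[
\sum_{k\ge3} k(k-1)(k-2)\,\pi_\theta(k)
= 6\, \mathbb{E}^\theta\Big[\Big(\sum_{n\ge1}\prod_{k=1}^{n}\rho_k\Big)^3\Big] < \infty,
\]

where the last equality comes from Proposition 2.3. From Minkowski's
inequality, we have

\[
\Big\{\mathbb{E}^\theta\Big[\Big(\sum_{n\ge1}\prod_{k=1}^{n}\rho_k\Big)^3\Big]\Big\}^{1/3}
\le \sum_{n\ge1}\Big\{\mathbb{E}^\theta\Big[\prod_{k=1}^{n}\rho_k^3\Big]\Big\}^{1/3}
= \sum_{n\ge1}\big[\mathbb{E}^\theta(\rho_0^3)\big]^{n/3},
\]

where the right-hand side term is finite according to the additional
assumption that $\mathbb{E}^\theta(\rho_0^3) < 1$. Since the bound in (21)
does not depend on $\theta$ and $\pi_\theta$ possesses a finite third moment,
the first part of condition (14) on the gradient vector is also satisfied.

Now, we turn to (13). Noting that

\[
\begin{aligned}
\partial_{a_1}Q_\theta(x,y) &= \binom{x+y}{x}\, p\, a_1^{x}(1-a_1)^{y-1}\big[(x+1)(1-a_1) - y a_1\big], \\
\partial_{a_2}Q_\theta(x,y) &= \binom{x+y}{x}\, (1-p)\, a_2^{x}(1-a_2)^{y-1}\big[(x+1)(1-a_2) - y a_2\big],
\end{aligned}
\]

that

\[
\sum_{y=0}^{\infty} y \binom{x+y}{x} a^{x+1}(1-a)^y = (x+1)\frac{1-a}{a}, \quad
\forall x \in \mathbb{N},\ \forall a \in (0,1), \tag{22}
\]

and using (20) yields (13).

The second order derivatives of $\phi_\theta$ are given by

\[
\begin{aligned}
\partial_p^2\phi_\theta(x,y) &= -\big[\partial_p\phi_\theta(x,y)\big]^2, \\
\partial_p\partial_{a_1}\phi_\theta(x,y) &= \big[\partial_{a_1}\phi_\theta(x,y)\big] \times \Big[\frac{1}{p} - \partial_p\phi_\theta(x,y)\Big], \\
\partial_{a_1}\partial_{a_2}\phi_\theta(x,y) &= -\big[\partial_{a_1}\phi_\theta(x,y)\big] \times \big[\partial_{a_2}\phi_\theta(x,y)\big], \\
\partial_{a_1}^2\phi_\theta(x,y) &= \big[\partial_{a_1}\phi_\theta(x,y)\big] \times \Big[-\partial_{a_1}\phi_\theta(x,y) + \frac{x}{a_1} - \frac{y-1}{1-a_1} - \frac{x+1+y}{(x+1)(1-a_1) - y a_1}\Big],
\end{aligned}
\]

and similar formulas for $a_2$ instead of $a_1$. The second part of (14) on
the Hessian matrix thus follows from the previous expressions combined with
(19), (21) and the existence of a second order moment for $\pi_\theta$. □

Proposition 3.5. In the framework of Example II, the covariance matrix
$\Sigma_\theta$ is positive definite, namely Assumption V is satisfied.

Proof of Proposition 3.5. We have

\[
\mathbb{E}^\theta\big[\omega_0^{x+1}(1-\omega_0)^y\big]
= p\,a_1^{x+1}(1-a_1)^y + (1-p)\,a_2^{x+1}(1-a_2)^y.
\]

The determinant of $\big(\partial_\theta \mathbb{E}^\theta[\omega_0^{k+1}(1-\omega_0)^k]\big)_{k=0,1,x}$
is given by

\[
\begin{vmatrix}
a_1 - a_2 & a_1^2(1-a_1) - a_2^2(1-a_2) & a_1^{x+1}(1-a_1)^x - a_2^{x+1}(1-a_2)^x \\
p & p\,a_1(2-3a_1) & p\,a_1^{x}(1-a_1)^{x-1}\big[x(1-2a_1) + 1 - a_1\big] \\
1-p & (1-p)\,a_2(2-3a_2) & (1-p)\,a_2^{x}(1-a_2)^{x-1}\big[x(1-2a_2) + 1 - a_2\big]
\end{vmatrix}
\]

and we denote it by Det. As we have $a_1 \ne a_2$ and $p \in (0,1)$, we show
that this determinant is non zero for large $x$. This will complete the proof,
thanks to Proposition 2.8.

We first consider the case of $a_1(1-a_1) \ne a_2(1-a_2)$, i.e. $a_2 \ne
1-a_1$ since $a_1 < a_2$. Without loss of generality, we assume $a_1(1-a_1) <
a_2(1-a_2)$. In this case, the leading terms as $x \to \infty$ in Det are

\[
\mathrm{Det} = p(1-p)\,a_2^{x}(1-a_2)^{x-1} \times
\begin{vmatrix}
a_1 - a_2 & a_1^2(1-a_1) - a_2^2(1-a_2) & -a_2(1-a_2) \\
1 & a_1(2-3a_1) & 0 \\
1 & a_2(2-3a_2) & x(1-2a_2) + 1 - a_2
\end{vmatrix}
+ o\big(a_2^{x}(1-a_2)^{x}\big).
\]

The sign of Det is determined by that of the above, new determinant. By
transience of the walk to $+\infty$, it holds $a_2 > 1/2$, and we see in this
new determinant that the coefficient of $x$, namely $(a_2-a_1)^2(1-2a_2)(1-2a_1-a_2)$,
and the constant term $(1-a_2)(a_2-a_1)\big[(a_2-a_1)(1-2a_1-a_2) -
a_2(2-3a_1-3a_2)\big]$ do not vanish simultaneously. Therefore, Det $\ne 0$
for large $x$.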

In the case $a_2 = 1 - a_1$, with some algebra we find

Det = p[l - p)af{l - aifia2 - ai)^(l -2a2)flifl2-^ + ^(fl[(l - ^1)^). 

with a nonzero leading term. Hence Det $\ne 0$ for large $x$. □

Thanks to Theorem 2.6 and Propositions 3.4 and 3.5, under the additional
assumption that $\mathbb{E}^\theta(\rho_0^3) < 1$, the sequence
$\{\sqrt{n}(\hat\theta_n - \theta^\star)\}$ converges in
$\mathbf{P}^\star$-distribution to a non degenerate centered Gaussian random
vector.



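3.3 Environment with Beta distribution

Example III. We let $\nu$ be a Beta distribution with parameters
$(\alpha,\beta)$, namely

\[
d\nu(a) = \frac{1}{B(\alpha,\beta)}\, a^{\alpha-1}(1-a)^{\beta-1}\, da, \qquad
B(\alpha,\beta) = \int_0^1 t^{\alpha-1}(1-t)^{\beta-1}\, dt.
\]

Here, the unknown parameter is $\theta = (\alpha,\beta) \in \Theta$ where
$\Theta$ is a compact subset of

\[
\{(\alpha,\beta) \in (0,+\infty)^2 : \alpha > \beta + 1\}.
\]

As $\mathbb{E}^\theta(\rho_0) = \beta/(\alpha-1)$, the constraint $\alpha >
\beta + 1$ ensures that points i) and ii) in Assumption I are satisfied.

In the framework of Example III, we have

\[
\phi_\theta(x,y) = \log\frac{B(x+1+\alpha,\, y+\beta)}{B(\alpha,\beta)} \tag{23}
\]

and

\[
\begin{aligned}
\ell_n(\theta) &= -n\log B(\alpha,\beta) + \sum_{x=0}^{n-1} \log B(L_{x+1}^n + \alpha + 1,\, L_x^n + \beta) \\
&= \sum_{x=0}^{n-1} \log
\frac{(L_{x+1}^n+\alpha)(L_{x+1}^n+\alpha-1)\cdots\alpha \times (L_x^n+\beta-1)(L_x^n+\beta-2)\cdots\beta}
     {(L_{x+1}^n+L_x^n+\alpha+\beta)(L_{x+1}^n+L_x^n+\alpha+\beta-1)\cdots(\alpha+\beta)}.
\end{aligned}
\]

In this case, Comets et al. (2012) proved that $\hat\theta_n$ is well-defined
and consistent in probability. We now establish that the assumptions needed
for asymptotic normality are also satisfied in this case.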
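
For numerical work, the Beta-function form (23) can be evaluated stably through log-gamma functions. The sketch below does so and compares it with the equivalent sum of logarithms obtained by expanding the ratio of Beta functions into finite products. The function names and the numerical values are assumptions made for illustration; `left_counts` stands for a hypothetical list $(L_0^n, \dots, L_n^n)$.

```python
import math

def log_beta(u, v):
    """Logarithm of the Beta function B(u, v), via log-gamma."""
    return math.lgamma(u) + math.lgamma(v) - math.lgamma(u + v)

def phi_beta(x, y, alpha, beta):
    """phi_theta(x, y) of (23) for a Beta(alpha, beta) environment."""
    return log_beta(x + 1 + alpha, y + beta) - log_beta(alpha, beta)

def phi_beta_sum(x, y, alpha, beta):
    """Same quantity written as finite sums of logarithms."""
    return (sum(math.log(k + alpha) for k in range(x + 1))
            + sum(math.log(k + beta) for k in range(y))
            - sum(math.log(k + alpha + beta) for k in range(x + y + 1)))

def criterion(left_counts, alpha, beta):
    """Criterion l_n(theta): sum over x of phi(L_{x+1}, L_x)."""
    return sum(phi_beta(left_counts[x + 1], left_counts[x], alpha, beta)
               for x in range(len(left_counts) - 1))

value = phi_beta(3, 2, 2.5, 1.0)
```

The log-gamma form avoids the overflow that direct products or Beta-function values would cause for large counts.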

Proposition 3.6. In the framework of Example III, Assumptions II to IV are
satisfied.

Proof of Proposition 3.6. Relying on classical identities on the Beta
function, it may be seen after some computations that

\[
\phi_\theta(x,y) = \sum_{k=0}^{x}\log(k+\alpha) + \sum_{k=0}^{y-1}\log(k+\beta)
- \sum_{k=0}^{x+y}\log(k+\alpha+\beta). \tag{24}
\]

As a consequence, we obtain

\[
\partial_\alpha\phi_\theta(x,y)
= \sum_{k=0}^{x}\frac{1}{k+\alpha} - \sum_{k=0}^{x+y}\frac{1}{k+\alpha+\beta}
= \sum_{k=0}^{x}\frac{\beta}{(k+\alpha)(k+\alpha+\beta)} - \sum_{k=1}^{y}\frac{1}{k+x+\alpha+\beta}.
\]

The fact that $\Theta$ is a compact set included in $(0,+\infty)^2$ yields the
existence of a constant $A$ independent of $\theta$, $x$ and $y$ such that both

\[
\sum_{k=0}^{x}\frac{\beta}{(k+\alpha)(k+\alpha+\beta)}
\le \sum_{k=0}^{\infty}\frac{\beta}{(k+\alpha)(k+\alpha+\beta)} \le A,
\]

and

\[
\sum_{k=1}^{y}\frac{1}{k+x+\alpha+\beta} \le \sum_{k=1}^{y}\frac{1}{k} \le A\log(1+y).
\]

The same holds for $\partial_\beta\phi_\theta(x,y)$. Hence, we have

\[
|\partial_\alpha\phi_\theta(x,y)| \le A'\log(1+y)
\quad\text{and}\quad
|\partial_\beta\phi_\theta(x,y)| \le A'\log(1+x), \tag{25}
\]

for some positive constant $A'$. Since there exists a constant $B$ such that
for any integer $x$

\[
\log(1+x) \le B\sqrt[4]{x},
\]

we deduce from (25) that there exists $C > 0$ such that

\[
|\partial_\alpha\phi_\theta(x,y)|^{2q} \le Cy
\quad\text{and}\quad
|\partial_\beta\phi_\theta(x,y)|^{2q} \le Cx, \tag{26}
\]

where $q = 2$. From Proposition 2.3, we know that $\pi_\theta$ possesses a
finite first moment, and together with (26), this is sufficient for (12) to be
satisfied. Since the bound in (26) does not depend on $\theta$, the first part
of condition (14) on the gradient vector is also satisfied.

Now, we prove that it is possible to exchange the order of derivation and
summation to get (13). To do so, we prove that

\[
\sum_{y} \|\dot Q_\theta(x,y)\| < \infty, \tag{27}
\]

for any integer $x$. Define $\theta_0 = (\alpha_0,\beta_0)$ with

\[
\alpha_0 = \inf(\mathrm{proj}_1(\Theta)) \quad\text{and}\quad \beta_0 = \inf(\mathrm{proj}_2(\Theta)),
\]

where $\mathrm{proj}_i$, $i = 1,2$ are the two projectors on the coordinates.
Note that $\theta_0$ does not necessarily belong to $\Theta$. However, it
still belongs to the ballistic region $\{\alpha > \beta + 1\}$. For any $a \in
(0,1)$ and any integers $x$ and $y$, we have

\[
a^{x+\alpha}(1-a)^{y+\beta-1} \le a^{x+\alpha_0}(1-a)^{y+\beta_0-1},
\]

which yields

\[
B(x+1+\alpha,\, y+\beta) \le B(x+1+\alpha_0,\, y+\beta_0),
\]

as well as

\[
Q_\theta(x,y) \le \frac{B(\alpha_0,\beta_0)}{B(\alpha,\beta)}\, Q_{\theta_0}(x,y).
\]

Using the fact that the beta function is continuous on the compact set
$\Theta$ yields the existence of a constant $C$ such that

\[
Q_\theta(x,y) \le C\, Q_{\theta_0}(x,y),
\]

for any integers $x$ and $y$. Now recall that $\dot Q_\theta(x,y) =
Q_\theta(x,y)\,\dot\phi_\theta(x,y)$. Hence, using the last inequality and
(26), it is sufficient to prove that

\[
\sum_{y} y\, Q_{\theta_0}(x,y) < \infty, \tag{28}
\]

to get (27). We have

\[
\sum_{x} \pi_{\theta_0}(x) \Big(\sum_{y} y\, Q_{\theta_0}(x,y)\Big) = \sum_{y} y\, \pi_{\theta_0}(y) < \infty,
\]

where the last inequality comes from the fact that $\theta_0$ lies in the
ballistic region and thus $\pi_{\theta_0}$ possesses a finite first moment.
Hence, (28) is satisfied for any integer $x$ which proves that (27) is
satisfied.

The second order derivatives of $\phi_\theta$ are given by

\[
\begin{aligned}
\partial_\alpha^2\phi_\theta(x,y) &= -\sum_{k=0}^{x}\frac{1}{(k+\alpha)^2} + \sum_{k=0}^{x+y}\frac{1}{(k+\alpha+\beta)^2}, \\
\partial_\alpha\partial_\beta\phi_\theta(x,y) &= \sum_{k=0}^{x+y}\frac{1}{(k+\alpha+\beta)^2},
\end{aligned}
\]

and similar formulas for $\beta$ instead of $\alpha$. Thus, the second part of
condition (14) for the Hessian matrix follows by arguments similar to those
establishing the first part of (14) for the gradient vector. □

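Proposition 3.7. In the framework of Example III, the covariance matrix
$\Sigma_\theta$ is positive definite, namely Assumption V is satisfied.

Proof of Proposition 3.7. One easily checks that

\[
\dot\phi_\theta(x,x) =
\begin{pmatrix}
\dfrac{1}{\alpha+x} + \dfrac{1}{\alpha+x-1} + \dots + \dfrac{1}{\alpha}
 - \dfrac{1}{\alpha+\beta+2x} - \dfrac{1}{\alpha+\beta+2x-1} - \dots - \dfrac{1}{\alpha+\beta} \\[2ex]
\dfrac{1}{\beta+x-1} + \dfrac{1}{\beta+x-2} + \dots + \dfrac{1}{\beta}
 - \dfrac{1}{\alpha+\beta+2x} - \dfrac{1}{\alpha+\beta+2x-1} - \dots - \dfrac{1}{\alpha+\beta}
\end{pmatrix}.
\]

Hence, $\dot\phi_\theta(0,0)$ is collinear to $(\beta,-\alpha)^\top$ and
$\dot\phi_\theta(x,x) \to (-\log 2,-\log 2)^\top$ as $x \to \infty$. This
shows that $\dot\phi_\theta(x,x)$, $x \in \mathbb{N}$, spans the whole space,
and Proposition 2.8 applies. □

Thanks to Theorem 2.6 and Propositions 3.6 and 3.7, the sequence
$\{\sqrt{n}(\hat\theta_n - \theta^\star)\}$ converges in
$\mathbf{P}^\star$-distribution to a non degenerate centered Gaussian random
vector.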



4 Asymptotic normality 

We now establish the asymptotic normality of $\hat\theta_n$ stated in Theorem
2.6. The most important step lies in establishing Theorem 2.5 that states a
CLT for the gradient vector of the criterion $\ell_n$ (see Section 4.1). As a
consequence, we obtain the asymptotic normality of $\hat\theta_n$, following
the proof of Theorem 5.23 in van der Vaart (1998). This latter reference deals
with i.i.d. observations only, but may be easily generalized to our context as
explained in Section 4.2. Finally, Section 4.3 establishes the proof of
Proposition 2.8 stating a condition under which the Fisher information matrix
is non singular.



4.1 A central limit theorem for the gradient of the criterion



In this section, we prove Theorem l2.5l that is, the existence of a CLT for the score 
vector sequence Note that according to {9]l, we have 



n-l 



(29) 



where {Zfc}o<fc<n is the Markov chain introduced in Section [2l3l First, note that 
under Assumption[III]this quantity is integrable and centered vwth respect to P* . 
Indeed, recall that (pg{x,y) = QQ{x,y)/Qgix,y), thus we can write for all xeN, 



. . ^ Qfl*(x,y) 



Qe*[x,y) = dgl^ ^ Qeix,y) 



where we have used l fT3t to interchange sum and derivative. Then, 

E*((/)e*(Zfc,Zt+i)) = 0. 



(30) 
Now, we rely on a CLT for centered square-integrable martingales, see Theorem
3.2 in Hall and Heyde (1980). We introduce the quantities

$$\forall\, 1 \le k \le n, \quad U_{n,k} = \frac{1}{\sqrt{n}}\, \dot\varphi_{\theta^*}(Z_{k-1}, Z_k) \quad \text{and} \quad S_{n,k} = \sum_{j=1}^{k} U_{n,j},$$

as well as the natural filtration $\mathcal{F}_{n,k} = \mathcal{F}_k := \sigma(Z_j, j \le k)$. According to (30),
$\{S_{n,k}, 1 \le k \le n, n \ge 1\}$ is a martingale array with differences $U_{n,k}$. It is also
centered and square integrable from Assumption III. Thus according to Theorem
3.2 in Hall and Heyde (1980), as soon as we have

$$\max_{1\le i\le n} \|U_{n,i}\| \xrightarrow[n\to+\infty]{} 0 \quad \text{in } P^*\text{-probability}, \tag{31}$$

$$\sum_{i=1}^{n} U_{n,i} U_{n,i}^\top \xrightarrow[n\to+\infty]{} \Sigma_{\theta^*} \quad \text{in } P^*\text{-probability}, \tag{32}$$

$$\text{and } \Big\{ E^*\Big( \max_{1\le i\le n} U_{n,i} U_{n,i}^\top \Big) \Big\}_{n\in\mathbb{N}} \text{ is a bounded sequence}, \tag{33}$$

with $\Sigma_{\theta^*}$ a deterministic and finite covariance matrix, then the sum $S_{n,n}$
converges in distribution to a centered Gaussian random variable with covariance
matrix $\Sigma_{\theta^*}$, which proves Theorem 2.5. Now, the convergence (32) is a direct
consequence of the ergodic theorem stated in Proposition 2.4. Moreover the
limit $\Sigma_{\theta^*}$ is given by (15) and is finite according to Assumption III. Note that
more generally, the ergodic theorem (Proposition 2.4) combined with Assumption III
implies the convergence of $\{\sum_{1\le i\le n} \|U_{n,i}\|^2\}_n$ to a finite deterministic
limit, $P^*$-almost surely and in $L_1(P^*)$. Thus, condition (33) follows from this
$L_1(P^*)$-convergence, combined with the bound

$$\Big\| E^*\Big( \max_{1\le i\le n} U_{n,i} U_{n,i}^\top \Big) \Big\| \le \sum_{i=1}^{n} E^*\big( \|U_{n,i}\|^2 \big).$$

Finally, condition (31) is obtained by writing that for any $\varepsilon > 0$ and any $q > 1$, we
have

$$P^*\Big( \max_{1\le i\le n} \|U_{n,i}\| \ge \varepsilon \Big) = P^*\Big( \max_{1\le i\le n} \|\dot\varphi_{\theta^*}(Z_{i-1}, Z_i)\| \ge \varepsilon \sqrt{n} \Big)$$
$$\le \frac{1}{n^q \varepsilon^{2q}}\, E^*\Big( \max_{1\le i\le n} \|\dot\varphi_{\theta^*}(Z_{i-1}, Z_i)\|^{2q} \Big)$$
$$\le \frac{1}{n^{q-1} \varepsilon^{2q}} \cdot \frac{1}{n} \sum_{i=1}^{n} E^*\big( \|\dot\varphi_{\theta^*}(Z_{i-1}, Z_i)\|^{2q} \big),$$

where the first inequality is Markov's inequality. By using again Assumption III
and the ergodic theorem (Proposition 2.4), the right-hand side of this inequality
converges to zero whenever $q > 1$. This completes the proof.
4.2 Proof of asymptotic normality



We follow the proof of the asymptotic normality result for M-estimators stated in
Theorem 5.23 in van der Vaart (1998) in an i.i.d. context. Indeed, our estimator
$\hat\theta_n$ maximizes the function $\theta \mapsto \ell_n(\theta) = \sum_{x=0}^{n-1} \varphi_\theta(L^n_{x+1}, L^n_x)$ and converges in
$P^*$-probability to $\theta^*$. Moreover, it is shown in Comets et al. (2012) that the
normalised criterion satisfies

$$\frac{1}{n}\, \ell_n(\theta) \xrightarrow[n\to+\infty]{} \ell(\theta) := \tilde\pi_{\theta^*}(\varphi_\theta),$$

in $P^*$-probability and the limiting function $\ell$ has a unique maximum at $\theta^*$ (see
Theorem 4.1 and Section 4.4 in Comets et al., 2012). Under Assumption III we
obtain the following

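1) The function $\theta \mapsto \varphi_\theta(x,y)$ is differentiable at $\theta^*$ for all $(x,y)$ and there
exists some positive function $\dot\varphi : \mathbb{N}^2 \to \mathbb{R}$ such that

$$\tilde\pi_{\theta^*}(\dot\varphi^2) < +\infty,$$

and for any $\theta_1, \theta_2$ in a neighborhood of $\theta^*$, for any $(x,y) \in \mathbb{N}^2$, we have

$$|\varphi_{\theta_1}(x,y) - \varphi_{\theta_2}(x,y)| \le \|\dot\varphi(x,y)\| \cdot \|\theta_1 - \theta_2\|.$$

2) The map $\theta \mapsto \ell(\theta)$ admits a second order Taylor expansion at its
maximum $\theta^*$.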



If we moreover assume that the Fisher information matrix $\Sigma_{\theta^*} = -\tilde\pi_{\theta^*}(\ddot\varphi_{\theta^*})$ is
non singular, then we have

$$\sqrt{n}\,(\hat\theta_n - \theta^*) = \Sigma_{\theta^*}^{-1}\, \frac{1}{\sqrt{n}} \sum_{x=0}^{n-1} \dot\varphi_{\theta^*}(L^n_{x+1}, L^n_x) + o_P(1), \tag{34}$$

where $o_P(1)$ is a remainder term that converges in $P^*$-probability to 0. The proof
of the latter fact is a simple rewriting of the proof of Theorem 5.23 in van der Vaart
(1998) and is therefore omitted. The main point is that the usual empirical
process $\mathbb{G}_n$ appearing in the original proof should be replaced here by its
counterpart in our framework, namely the operator

$$\mathbb{G}_n(\phi) := \frac{1}{\sqrt{n}} \sum_{x=0}^{n-1} \Big( \phi(L^n_{x+1}, L^n_x) - \tilde\pi_{\theta^*}(\phi) \Big),$$

for any $\phi : \mathbb{N}^2 \to \mathbb{R}$ or $\mathbb{R}^d$ such that $\tilde\pi_{\theta^*}(\|\phi\|) < +\infty$. Combining the equality in
distribution between $(L^n_n, L^n_{n-1}, \dots, L^n_0)$ and the positive recurrent Markov chain
$\{Z_k\}_{0\le k\le n}$ with the ergodic theorem (Proposition 2.4) applied to this latter Markov
chain, the operator $\mathbb{G}_n$ satisfies

$$\frac{1}{\sqrt{n}}\, \mathbb{G}_n(\phi) = o_P(1),$$

which is the main ingredient of the proof.

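Finally, combining (34) with Theorem 2.5 we obtain the convergence in $P^*$-distribution
of $\{\sqrt{n}(\hat\theta_n - \theta^*)\}$ to a centered Gaussian random vector with covariance
matrix $\Sigma_{\theta^*}^{-1} \Sigma_{\theta^*} \Sigma_{\theta^*}^{-1} = \Sigma_{\theta^*}^{-1}$.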

4.3 Non degeneracy of the Fisher information 

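We now turn to the proof of Proposition 2.8. Let us consider a deterministic
vector $u \in \mathbb{R}^d$. We have

$$u^\top \Sigma_\theta u = \sum_{(x,y)\in\mathbb{N}^2} \tilde\pi_\theta(x,y)\, \big( u^\top \dot\varphi_\theta(x,y) \big)^2.$$

We recall that according to Proposition 2.3 the invariant probability measure $\pi_\theta$
is positive as well as $\tilde\pi_\theta$. As a consequence, the quantity $u^\top \Sigma_\theta u$ is non negative
and equals zero if and only if

$$\forall x, y \in \mathbb{N}, \quad u^\top \dot\varphi_\theta(x,y) = 0.$$

Let us assume that the linear span in $\mathbb{R}^d$ of the gradient vectors $\dot\varphi_\theta(x,y)$, $(x,y) \in \mathbb{N}^2$,
is equal to the full space, or equivalently, that

$$\mathrm{Vect}\big\{ \partial_\theta E_\theta\big( \omega_0^{x+1} (1-\omega_0)^y \big) : (x,y) \in \mathbb{N}^2 \big\} = \mathbb{R}^d.$$

Then, the equality $u^\top \dot\varphi_\theta(x,y) = 0$ for any $(x,y) \in \mathbb{N}^2$ implies $u = 0$. This concludes
the proof.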
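The spanning condition can be illustrated on a small case. The sketch below assumes a two-point environment $\nu_\theta = p\,\delta_{a_1} + (1-p)\,\delta_{a_2}$ with $\theta = (p, a_1, a_2)$, which we take to be the setting of Example II (an assumption, since that example is defined outside this section), and uses the values of Table 1. The helper names are hypothetical. It evaluates the determinant of three gradient vectors $\partial_\theta E_\theta(\omega_0^{x+1}(1-\omega_0)^y)$.

```python
def gradient(x, y, p, a1, a2):
    """Gradient in (p, a1, a2) of E[w^(x+1) (1-w)^y], w ~ p*delta_a1 + (1-p)*delta_a2."""
    def moment(a):
        return a ** (x + 1) * (1 - a) ** y

    def d_moment(a):
        # Derivative of a^(x+1) (1-a)^y with respect to a.
        first = (x + 1) * a ** x * (1 - a) ** y
        second = y * a ** (x + 1) * (1 - a) ** (y - 1) if y > 0 else 0.0
        return first - second

    return [moment(a1) - moment(a2), p * d_moment(a1), (1 - p) * d_moment(a2)]

def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

theta = (0.3, 0.4, 0.7)  # (p, a1, a2), values taken from Table 1

# Pairs (0,0), (1,0), (2,0): the three gradients are linearly independent.
full = det3([gradient(x, 0, *theta) for x in (0, 1, 2)])

# Pairs (0,0), (0,1), (1,0): dependent, because a(1-a) = a - a^2.
degenerate = det3([gradient(x, y, *theta) for (x, y) in ((0, 0), (0, 1), (1, 0))])
```

Under these assumptions `full` equals $p(1-p)(a_1-a_2)^4 = 0.001701$ up to rounding, so the three gradients span $\mathbb{R}^3$, while `degenerate` vanishes. The condition concerns the whole family of pairs $(x,y)$, and a particular triple may fail to be independent.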



5 Numerical performances 



In Comets et al. (2012), the authors have investigated the numerical performances
of the MLE and obtained that this estimator has better performances than the
one proposed by Adelman and Enriquez (2004), being less spread out than the
latter. In this section, we explore the possibility to construct confidence regions
for the parameter $\theta$, relying on the asymptotic normality result obtained in
Theorem 2.6. Indeed, the limiting covariance $\Sigma_{\theta^*}^{-1}$ may be approximated through the
observed Fisher information matrix

$$\hat\Sigma_n = -\frac{1}{n} \sum_{x=0}^{n-1} \ddot\varphi_{\hat\theta_n}(L^n_{x+1}, L^n_x). \tag{35}$$

The consistency of $\hat\theta_n$ combined with Proposition 2.4, Theorem 2.6 and Slutsky's
Lemma first gives the convergence of $\hat\Sigma_n$ to $\Sigma_{\theta^*}$ and then the convergence in
distribution

$$\sqrt{n}\, \hat\Sigma_n^{1/2} (\hat\theta_n - \theta^*) \xrightarrow[n\to+\infty]{} \mathcal{N}_d(0, I_d) \quad \text{under } P^*,$$

where $\mathcal{N}_d(0, I_d)$ is the centered and normalised $d$-dimensional normal distribution.
When $d = 1$, we thus consider confidence intervals of the form



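$$\mathcal{R}_{\gamma,n} = \Big[ \hat\theta_n - \frac{q_{1-\gamma/2}}{n^{1/2}\, \hat\Sigma_n^{1/2}} \,;\; \hat\theta_n + \frac{q_{1-\gamma/2}}{n^{1/2}\, \hat\Sigma_n^{1/2}} \Big], \tag{36}$$

where $1 - \gamma$ is the asymptotic confidence level and $q_z$ the $z$-th quantile of the
standard normal one-dimensional distribution. In higher dimensions ($d \ge 2$),
the confidence regions are more generally built relying on the chi-square
distribution, namely

$$\mathcal{R}_{\gamma,n} = \big\{ \theta \in \Theta : n\, \|\hat\Sigma_n^{1/2} (\hat\theta_n - \theta)\|^2 \le \chi_{1-\gamma} \big\}, \tag{37}$$

where $1 - \gamma$ is still the asymptotic confidence level and now $\chi_z$ is the $z$-th quantile
of the chi-square distribution with $d$ degrees of freedom $\chi^2(d)$. Note that the
two definitions (36) and (37) coincide when $d = 1$. Moreover, the confidence
region $\mathcal{R}_{\gamma,n}$ is also given by

$$\mathcal{R}_{\gamma,n} = \big\{ \theta \in \Theta : n\, (\hat\theta_n - \theta)^\top \hat\Sigma_n (\hat\theta_n - \theta) \le \chi_{1-\gamma} \big\}.$$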
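As an illustration of how such regions may be evaluated in practice, here is a minimal sketch using only the Python standard library. The function names and the numerical inputs are hypothetical and do not come from the simulations of this paper. For $d = 2$ the sketch uses the explicit chi-square quantile $\chi_{1-\gamma} = -2\log\gamma$, which is valid for two degrees of freedom only; other dimensions would need a numerical quantile routine.

```python
import math
from statistics import NormalDist

def confidence_interval(theta_hat, sigma_hat, n, gamma):
    """Interval of the form (36) for d = 1.

    Returns theta_hat -/+ q_{1-gamma/2} / sqrt(n * sigma_hat), where sigma_hat
    is a (scalar) estimate of the Fisher information.
    """
    q = NormalDist().inv_cdf(1.0 - gamma / 2.0)
    half_width = q / math.sqrt(n * sigma_hat)
    return theta_hat - half_width, theta_hat + half_width

def in_region_2d(theta, theta_hat, sigma_hat, n, gamma):
    """Membership test for the quadratic-form region when d = 2.

    sigma_hat is a symmetric 2x2 matrix given as nested lists. With two degrees
    of freedom the chi-square quantile of order 1 - gamma is -2 * log(gamma).
    """
    d0 = theta_hat[0] - theta[0]
    d1 = theta_hat[1] - theta[1]
    quad = (sigma_hat[0][0] * d0 * d0
            + 2.0 * sigma_hat[0][1] * d0 * d1
            + sigma_hat[1][1] * d1 * d1)
    return n * quad <= -2.0 * math.log(gamma)

# Hypothetical inputs, for illustration only.
lo, hi = confidence_interval(theta_hat=0.31, sigma_hat=2.0, n=1000, gamma=0.05)
inside = in_region_2d((5.0, 1.0), (5.0, 1.0), [[1.0, 0.0], [0.0, 1.0]], 1000, 0.05)
outside = in_region_2d((5.5, 1.0), (5.0, 1.0), [[1.0, 0.0], [0.0, 1.0]], 1000, 0.05)
```

The interval shrinks at rate $n^{-1/2}$, and the two-dimensional test is the quadratic-form version of the region, which avoids computing a matrix square root.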



We present three simulation settings corresponding to the three examples
developed in Section 3 and already explored in Comets et al. (2012). For each of
the three simulation settings, the true parameter value $\theta^*$ is chosen according
to Table 1 and corresponds to a transient and ballistic random walk. We rely on
1000 iterations of each of the following procedures. For each setting and each
iteration, we first generate a random environment according to $\nu_{\theta^*}$ on the set of
sites $\{-10^4, \dots, 10^4\}$. Note that we do not use the environment values for all the
$10^4$ negative sites, since only few of these sites are visited by the walk. However
this extra computation cost is negligible. Then, we run a random walk in this
environment and stop it successively at the hitting times $T_n$ defined earlier, with
$n \in \{10^3 k : 1 \le k \le 10\}$. For each stopping value $n$, we compute the estimators
$\hat\theta_n$, $\hat\Sigma_n$ and the confidence region $\mathcal{R}_{\gamma,n}$ for $\gamma \in \{0.01, 0.05, 0.1\}$.
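The procedure just described can be sketched as follows for a one-parameter case. This is not the authors' code. It assumes a two-point environment $\nu_p = p\,\delta_{a_1} + (1-p)\,\delta_{a_2}$ with known $(a_1, a_2)$, which we take to be the setting of Example I, draws the environment lazily instead of on a fixed window of sites, and uses a golden-section search as a stand-in for whatever numerical maximisation was used in the paper. All function names are hypothetical.

```python
import math
import random

def simulate_left_steps(n, p, a1, a2, rng):
    """Run one walk from 0 until it first hits n; return (L, T).

    L[x] is the number of left steps taken from site x before hitting n,
    and T is the hitting time of n. Sites get their environment when visited.
    """
    env, L = {}, {}
    pos, T = 0, 0
    while pos < n:
        if pos not in env:
            env[pos] = a1 if rng.random() < p else a2  # P(step right) at pos
        if rng.random() < env[pos]:
            pos += 1
        else:
            L[pos] = L.get(pos, 0) + 1
            pos -= 1
        T += 1
    return L, T

def log_likelihood(p, L, n, a1, a2):
    """Criterion sum_{x<n} log( p a1^r (1-a1)^l + (1-p) a2^r (1-a2)^l ).

    Here r = L[x+1] + 1 right steps and l = L[x] left steps from site x.
    Each term is computed with a log-sum-exp to avoid underflow.
    """
    total = 0.0
    for x in range(n):
        r, l = L.get(x + 1, 0) + 1, L.get(x, 0)
        t1 = math.log(p) + r * math.log(a1) + l * math.log(1 - a1)
        t2 = math.log(1 - p) + r * math.log(a2) + l * math.log(1 - a2)
        m = max(t1, t2)
        total += m + math.log(math.exp(t1 - m) + math.exp(t2 - m))
    return total

def maximise_concave(f, lo=1e-6, hi=1 - 1e-6, tol=1e-8):
    """Golden-section search for the maximiser of a concave function on [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    c, d = hi - inv_phi * (hi - lo), lo + inv_phi * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc < fd:  # the maximiser lies in [c, hi]
            lo, c, fc = c, d, fd
            d = lo + inv_phi * (hi - lo)
            fd = f(d)
        else:        # the maximiser lies in [lo, d]
            hi, d, fd = d, c, fc
            c = hi - inv_phi * (hi - lo)
            fc = f(c)
    return (lo + hi) / 2

rng = random.Random(0)
n, p_star, a1, a2 = 2000, 0.3, 0.4, 0.7
L, T = simulate_left_steps(n, p_star, a1, a2, rng)
p_hat = maximise_concave(lambda p: log_likelihood(p, L, n, a1, a2))
```

In this one-parameter setting the criterion is a sum of logarithms of affine functions of $p$, hence concave, which is what makes a bracketing search adequate. In higher dimensions a general-purpose optimiser would be needed.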



Simulation     Fixed parameter              Estimated parameter
Example I      $(a_1, a_2) = (0.4, 0.7)$    $p^* = 0.3$
Example II     -                            $(p^*, a_1^*, a_2^*) = (0.3, 0.4, 0.7)$
Example III    -                            $(\alpha^*, \beta^*) = (5, 1)$

Table 1: Parameter values for each experiment.

We first explore the convergence of $\hat\Sigma_n$ when $n$ increases. We mention that the
true value $\Sigma_{\theta^*}$ is unknown even in a simulation setting (since $\tilde\pi_{\theta^*}$ is unknown).
Thus we can observe the convergence of $\hat\Sigma_n$ with $n$ but cannot assess any bias
towards the true value $\Sigma_{\theta^*}$. The results are presented in Figures 1, 2 and 3,
corresponding to the cases of Examples I, II and III respectively. The estimators
appear to converge when $n$ increases and their variance also decreases as
expected. We mention that in the cases of Examples I and II we have 1% and 1.3%
respectively of the total $10 \times 1000$ experiments for which the numerical
maximisation of the likelihood did not give a result and thus for which we could not
compute a confidence region.




Figure 1: Boxplot of the estimator $\hat\Sigma_n$ obtained from 1000 iterations and for
values $n$ ranging in $\{10^3 k : 1 \le k \le 10\}$ in the case of Example I.

Now, we consider the empirical coverages obtained from our confidence
regions $\mathcal{R}_{\gamma,n}$ in the three examples and with $\gamma \in \{0.01, 0.05, 0.1\}$ and $n$ ranging in
$\{10^3 k : 1 \le k \le 10\}$. The results are presented in Table 2. For the three examples,
the empirical coverages are close to the nominal levels $1 - \gamma$. We also note that the
accuracy does not significantly change when $n$ increases from $10^3$ to $10^4$. As a
conclusion, we have shown that it is possible to construct accurate confidence
regions for the parameter value.



          Example I                Example II               Example III
n         0.01    0.05    0.1      0.01    0.05    0.1      0.01    0.05    0.1
1000      0.994   0.952   0.899    0.992   0.953   0.909    0.977   0.942   0.901
2000      0.989   0.952   0.903    0.994   0.953   0.910    0.978   0.928   0.884
3000      0.988   0.942   0.901    0.990   0.938   0.886    0.981   0.940   0.889
4000      0.991   0.944   0.896    0.991   0.951   0.894    0.988   0.945   0.900
5000      0.990   0.942   0.896    0.993   0.942   0.891    0.986   0.941   0.883
6000      0.983   0.948   0.901    0.987   0.951   0.888    0.988   0.937   0.897
7000      0.986   0.950   0.900    0.992   0.951   0.900    0.986   0.942   0.898
8000      0.987   0.956   0.898    0.988   0.950   0.903    0.981   0.946   0.903
9000      0.990   0.959   0.913    0.990   0.949   0.893    0.985   0.939   0.901
10000     0.987   0.954   0.908    0.990   0.949   0.899    0.983   0.944   0.892

Table 2: Empirical coverages of $(1 - \gamma)$ asymptotic level confidence regions, for
$\gamma \in \{0.01, 0.05, 0.1\}$ and relying on 1000 iterations. The columns under each
example correspond to the values of $\gamma$.
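One rough way to read Table 2 is to compare each empirical coverage with the Monte Carlo fluctuation expected at the nominal level. The sketch below is an illustration and not part of the original study. It treats the 1000 iterations as independent Bernoulli trials with success probability $1 - \gamma$, which is only an approximation since a small fraction of the runs produced no confidence region. The function name is hypothetical, and the values are those of the first row of Table 2.

```python
import math

def coverage_z_score(empirical, gamma, iterations=1000):
    """Standardised gap between an empirical coverage and the nominal level.

    Under the binomial approximation, the standard error of the empirical
    coverage is sqrt(gamma * (1 - gamma) / iterations).
    """
    nominal = 1.0 - gamma
    standard_error = math.sqrt(gamma * (1.0 - gamma) / iterations)
    return (empirical - nominal) / standard_error

# First row of Table 2 (n = 1000): {example: {gamma: empirical coverage}}.
row_n1000 = {
    "I": {0.01: 0.994, 0.05: 0.952, 0.1: 0.899},
    "II": {0.01: 0.992, 0.05: 0.953, 0.1: 0.909},
    "III": {0.01: 0.977, 0.05: 0.942, 0.1: 0.901},
}

z_scores = {
    example: {gamma: coverage_z_score(cov, gamma) for gamma, cov in cells.items()}
    for example, cells in row_n1000.items()
}
```

Under this approximation most entries of the row fall within about two standard errors of the nominal level, while the Example III entry for $\gamma = 0.01$ sits roughly four standard errors below it.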
Figure 2: Boxplots of the values of the matrix $\hat\Sigma_n$ obtained from 1000 iterations
and for values $n$ ranging in $\{10^3 k : 1 \le k \le 10\}$ in the case of Example II. The
parameter is ordered as $\theta = (\theta_1, \theta_2, \theta_3) = (p, a_1, a_2)$ and the figure displays the
values: $\hat\Sigma_n(1,1)$; $\hat\Sigma_n(2,2)$; $\hat\Sigma_n(3,3)$; $\hat\Sigma_n(1,2)$; $\hat\Sigma_n(1,3)$ and $\hat\Sigma_n(2,3)$, from left to right
and top to bottom.
for sharing many fruitful reflexions about this work.

Figure 3: Boxplots of the values of the matrix $\hat\Sigma_n$ obtained from 1000 iterations
and for values $n$ ranging in $\{10^3 k : 1 \le k \le 10\}$ in the case of Example III. The
parameter is ordered as $\theta = (\theta_1, \theta_2) = (\alpha, \beta)$ and the figure displays the values:
$\hat\Sigma_n(1,1)$; $\hat\Sigma_n(2,2)$ and $\hat\Sigma_n(1,2)$, from left to right.

References

Adelman, O. and N. Enriquez (2004). Random walks in random environment:
what a single trajectory tells. Israel J. Math. 142, 205-220.

Benjamini, I. and H. Kesten (1996). Distinguishing sceneries by observing the
scenery along a random walk path. J. Anal. Math. 69, 97-135.

Chernov, A. (1967). Replication of a multicomponent chain by the lightning
mechanism. Biofizika 12, 297-301.

Comets, F., M. Falconnet, O. Loukianov, D. Loukianova, and C. Matias (2012).
Maximum likelihood estimator consistency for ballistic random walk in a
parametric random environment. Technical report, arXiv:1210.6328.

Hall, P. and C. C. Heyde (1980). Martingale limit theory and its application. New 
York: Academic Press Inc. [Harcourt Brace Jovanovich Publishers]. Probability 
and Mathematical Statistics. 

Hughes, B. D. (1996). Random walks and random environments. Vol. 2. Oxford 
Science Publications. New York: The Clarendon Press Oxford University Press. 
Random environments. 

Kesten, H., M. V. Kozlov, and F. Spitzer (1975). A limit law for random walk in a
random environment. Compositio Math. 30, 145-168.

Kozlov, M. (1973). Random walk in a one-dimensional random medium. Theory 
Probab. Appl. 18, 387-388. 

Lowe, M. and H. Matzinger, III (2002). Scenery reconstruction in two dimensions
with many colors. Ann. Appl. Probab. 12(4), 1322-1347.

Matzinger, H. (1999). Reconstructing a three-color scenery by observing it along
a simple random walk path. Random Structures Algorithms 15(2), 196-207.

Revuz, D. (1984). Markov chains (Second ed.). Volume 11 of North-Holland 
Mathematical Library. Amsterdam: North-Holland Publishing Co. 

Solomon, F. (1975). Random walks in a random environment. Ann. Probability 3, 
1-31. 






Temkin, D. E. (1972). One-dimensional random walks in a two-component
chain. Soviet Mathematics Doklady 13, 1172-1176.

van der Vaart, A. W. (1998). Asymptotic statistics, Volume 3 of Cambridge Series in
Statistical and Probabilistic Mathematics. Cambridge: Cambridge University
Press.

Zeitouni, O. (2004). Random walks in random environment. In Lectures on prob- 
ability theory and statistics, Volume 1837 of Lecture Notes in Math., pp. 189- 
312. Berlin: Springer. 






